cooperation level
Q-Learning-Driven Adaptive Rewiring for Cooperative Control in Heterogeneous Networks
Cooperation emergence in multi-agent systems represents a fundamental statistical physics problem where microscopic learning rules drive macroscopic collective behavior transitions. We propose a Q-learning-based variant of adaptive rewiring that builds on mechanisms studied in the literature. This method combines temporal difference learning with network restructuring so that agents can optimize strategies and social connections based on interaction histories. Through neighbor-specific Q-learning, agents develop sophisticated partnership management strategies that enable cooperator cluster formation, creating spatial separation between cooperative and defective regions. Using power-law networks that reflect real-world heterogeneous connectivity patterns, we evaluate emergent behaviors under varying rewiring constraint levels, revealing distinct cooperation patterns across parameter space rather than sharp thermodynamic transitions. Our systematic analysis identifies three behavioral regimes: a permissive regime (low constraints) enabling rapid cooperative cluster formation, an intermediate regime with sensitive dependence on dilemma strength, and a patient regime (high constraints) where strategic accumulation gradually optimizes network structure. Comparative analysis against Bush-Mosteller stimulus-response learning demonstrates that Q-learning's temporal credit assignment capabilities produce superior cooperation outcomes, particularly under intermediate rewiring constraints where long-term relationship assessment becomes crucial. Quantitative analysis reveals that increased rewiring frequency drives large-scale cluster formation with power-law size distributions. Our results establish a new paradigm for understanding intelligence-driven cooperation pattern formation in complex adaptive systems, revealing how machine learning serves as an alternative driving force for spontaneous organization in multi-agent networks. 
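The neighbor-specific temporal difference update mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the stateless Q-table per neighbor, the weak prisoner's dilemma payoffs, the learning rate `alpha`, the discount `gamma`, and the `worst_neighbor` rewiring heuristic are all assumptions made for illustration.

```python
from collections import defaultdict

ACTIONS = ("C", "D")  # cooperate, defect


def payoff(my_action, their_action, b=1.5):
    """Assumed weak prisoner's dilemma payoff; b is the temptation to defect."""
    if their_action != "C":
        return 0.0
    return 1.0 if my_action == "C" else b


class NeighborQ:
    """Hypothetical agent that keeps one small Q-table per neighbor."""

    def __init__(self, alpha=0.1, gamma=0.9):
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})

    def update(self, neighbor, action, reward):
        """One temporal difference step for the interaction with `neighbor`."""
        table = self.q[neighbor]
        target = reward + self.gamma * max(table.values())
        table[action] += self.alpha * (target - table[action])

    def worst_neighbor(self):
        """Candidate link to rewire: the neighbor with the lowest best Q-value."""
        return min(self.q, key=lambda n: max(self.q[n].values()))


agent = NeighborQ(alpha=0.5, gamma=0.0)
agent.update("a", "C", payoff("C", "C"))  # cooperative partner
agent.update("b", "C", payoff("C", "D"))  # defecting partner
print(agent.worst_neighbor())
```

Keeping a separate table per neighbor is what lets the agent judge each partnership on its own history; how often the lowest-valued link may actually be replaced would correspond to the rewiring constraint that the abstract varies.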
Introduction
Ensuring cooperative control in distributed engineered systems and applications is a daunting challenge across diverse domains. In distributed resource management, cooperative agents must dynamically adapt to balance local demands and maintain global performance [1]; in urban traffic networks, intersections must exchange information to optimize flows [2, 3]; in robotic swarms, unmanned aerial vehicles or mobile robots must align actions for collective tasks under uncertainty [4, 5]. In each case, the performance of the overall system, including throughput, latency, reliability, and safety, depends on the ability of autonomous agents to adapt strategies and restructure interactions in dynamic environments. Enhancing cooperation among agents is therefore essential, since insufficient coordination can lead to cascading failures, degraded performance, or even systemic collapse in critical infrastructures.
Bottom-Up Reputation Promotes Cooperation with Multi-Agent Reinforcement Learning
Ren, Tianyu, Yao, Xuan, Li, Yang, Zeng, Xiao-Jun
Reputation serves as a powerful mechanism for promoting cooperation in multi-agent systems, as agents are more inclined to cooperate with those of good social standing. While existing multi-agent reinforcement learning methods typically rely on predefined social norms to assign reputations, the question of how a population reaches a consensus on judgement when agents hold private, independent views remains unresolved. In this paper, we propose a novel bottom-up reputation learning method, Learning with Reputation Reward (LR2), designed to promote cooperative behaviour through rewards shaping based on assigned reputation. Our agent architecture includes a dilemma policy that determines cooperation by considering the impact on neighbours, and an evaluation policy that assigns reputations to affect the actions of neighbours while optimizing self-objectives. It operates using local observations and interaction-based rewards, without relying on centralized modules or predefined norms. Our findings demonstrate the effectiveness and adaptability of LR2 across various spatial social dilemma scenarios. Interestingly, we find that LR2 stabilizes and enhances cooperation not only with reward reshaping from bottom-up reputation but also by fostering strategy clustering in structured populations, thereby creating environments conducive to sustained cooperation.
Cooperation and Personalization on a Seesaw: Choice-based FL for Safe Cooperation in Wireless Networks
Zhang, Han, Elsayed, Medhat, Bavand, Majid, Gaigalas, Raimundas, Ozcan, Yigit, Erol-Kantarci, Melike
Federated learning (FL) is an innovative distributed artificial intelligence (AI) technique. It has been used for interdisciplinary studies in different fields such as healthcare, marketing, and finance. However, the application of FL in wireless networks is still in its infancy. In this work, we first review the benefits and concerns of applying FL to wireless networks. Next, we provide a new perspective on existing personalized FL frameworks by analyzing the relationship between cooperation and personalization in these frameworks. Additionally, we discuss the possibility of tuning the cooperation level with a choice-based approach. Our choice-based FL approach is a flexible and safe FL framework that allows participants to lower the level of cooperation when they feel unsafe or unable to benefit from the cooperation. In this way, the choice-based FL framework aims to address the safety and fairness concerns in FL and protect participants from malicious attacks.
Cooperative bots exhibit nuanced effects on cooperation across strategic frameworks
Si, Zehua, He, Zhixue, Shen, Chen, Tanimoto, Jun
The positive impact of cooperative bots on cooperation within evolutionary game theory is well documented; however, existing studies have predominantly used discrete strategic frameworks, focusing on deterministic actions with a fixed probability of one. This paper extends the investigation to continuous and mixed strategic approaches. Continuous strategies employ intermediate probabilities to convey varying degrees of cooperation and focus on expected payoffs. In contrast, mixed strategies calculate immediate payoffs from actions chosen at a given moment within these probabilities. Using the prisoner's dilemma game, this study examines the effects of cooperative bots on human cooperation within hybrid populations of human players and simple bots, across both well-mixed and structured populations. Our findings reveal that cooperative bots significantly enhance cooperation in both population types across these strategic approaches under weak imitation scenarios, where players are less concerned with material gains. However, under strong imitation scenarios, while cooperative bots do not alter the defective equilibrium in well-mixed populations, they have varied impacts in structured populations across these strategic approaches. Specifically, they disrupt cooperation under discrete and continuous strategies but facilitate it under mixed strategies. These results highlight the nuanced effects of cooperative bots within different strategic frameworks and underscore the need for careful deployment, as their effectiveness is highly sensitive to how humans update their actions and their chosen strategic approach.
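The distinction the abstract above draws between continuous and mixed strategies can be made concrete with a short sketch. The payoff values (R=3, S=0, T=5, P=1) and the function names are assumptions for illustration, not taken from the paper: a continuous strategy is scored by its expected payoff over the cooperation probabilities, while a mixed strategy samples one action per player and receives the realized payoff.

```python
import random

# Assumed prisoner's dilemma payoffs for the focal player:
# reward, sucker's payoff, temptation, punishment.
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def expected_payoff(p, q):
    """Continuous strategy: expected payoff when the focal player cooperates
    with probability p and the opponent with probability q."""
    return (p * q * R + p * (1 - q) * S
            + (1 - p) * q * T + (1 - p) * (1 - q) * P)


def sampled_payoff(p, q, rng):
    """Mixed strategy: draw one concrete action per player, then pay out
    the realized payoff of that single interaction."""
    me_cooperates = rng.random() < p
    other_cooperates = rng.random() < q
    if me_cooperates:
        return R if other_cooperates else S
    return T if other_cooperates else P


rng = random.Random(0)
print(expected_payoff(0.5, 0.5))
print(sampled_payoff(0.5, 0.5, rng))
```

For the same probabilities the two quantities agree only on average: the sampled payoff is always one of the four matrix entries, which is one plausible reason imitation dynamics can differ between the two frameworks.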
Polarize, Catalyze, Stabilize: How a minority of norm internalizers amplify group selection and punishment
Odouard, Victor Vikram, Smirnova, Diana, Edelman, Shimon
Many mechanisms behind the evolution of cooperation, such as reciprocity, indirect reciprocity, and altruistic punishment, require group knowledge of individual actions. But what keeps people cooperating when no one is looking? Conformist norm internalization, the tendency to abide by the behavior of the majority of the group, even when it is individually harmful, could be the answer. In this paper, we analyze a world where (1) there is group selection and punishment by indirect reciprocity but (2) many actions (half) go unobserved, and therefore unpunished. Can norm internalization fill this "observation gap" and lead to high levels of cooperation, even when agents may in principle cooperate only when likely to be caught and punished? Specifically, we seek to understand whether adding norm internalization to the strategy space in a public goods game can lead to higher levels of cooperation when both norm internalization and cooperation start out rare. We found the answer to be positive, but, interestingly, not because norm internalizers end up making up a substantial fraction of the population, nor because they cooperate much more than other agent types. Instead, norm internalizers, by polarizing, catalyzing, and stabilizing cooperation, can increase levels of cooperation of other agent types, while only making up a minority of the population themselves.
Modeling Prejudice and Its Effect on Societal Prosperity
Mohan, Deep Inder, Verma, Arjun, Rao, Shrisha
Existing studies on prejudice, which is important in multi-group dynamics in societies, focus on the social-psychological knowledge behind the processes involving prejudice and its propagation. We instead create a multi-agent framework that simulates the propagation of prejudice and measures its tangible impact on the prosperity of individuals as well as of larger social structures, including groups and the factions within them. Groups in society help us define prejudice, and factions represent smaller tight-knit circles of individuals with similar opinions. We model social interactions using the Continuous Prisoner's Dilemma (CPD) and a type of agent called a prejudiced agent, whose cooperation is affected by a prejudice attribute, updated over time based both on the agent's own experiences and those of others in its faction. Our simulations show that modeling prejudice as an exclusively out-group phenomenon generates implicit in-group promotion, which eventually leads to higher relative prosperity of the prejudiced population. This skew in prosperity is shown to be correlated with factors such as size difference between groups and the number of prejudiced agents in a group. Although prejudiced agents achieve higher prosperity within prejudiced societies, their presence degrades the overall prosperity levels of their societies. Our proposed system model can serve as a basis for promoting a deeper understanding of the origins, propagation, and ramifications of prejudice through rigorous simulation studies grounded in apt theoretical backgrounds. This can help conduct impactful research on prominent social issues such as racism, religious discrimination, and unfair immigrant treatment. This model can also serve as a foundation to study other socio-psychological phenomena in tandem with prejudice, such as the distribution of wealth, social status, and ethnocentrism in a society.
Cooperation-Aware Reinforcement Learning for Merging in Dense Traffic
Bouton, Maxime, Nakhaei, Alireza, Fujimura, Kikuo, Kochenderfer, Mykel J.
Decision making in dense traffic can be challenging for autonomous vehicles. An autonomous system that relies only on predefined road priorities and treats other drivers as moving objects will cause the vehicle to freeze and fail the maneuver. Human drivers leverage the cooperation of other drivers to avoid such deadlock situations and convince others to change their behavior. Decision-making algorithms must reason about the interaction with other drivers and anticipate a broad range of driver behaviors. In this work, we present a reinforcement learning approach to learn how to interact with drivers with different cooperation levels. We enhanced the performance of traditional reinforcement learning algorithms by maintaining a belief over the level of cooperation of other drivers. We show that our agent successfully learns how to navigate a dense merging scenario with fewer deadlocks than with online planning methods.
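Maintaining a belief over another driver's cooperation level, as the abstract above describes, can be sketched as a simple Bayes update. This is a generic two-hypothesis filter, not the authors' belief tracker: the binary cooperative/aggressive driver model, the observation "yielded", and the likelihood values are all assumptions chosen for illustration.

```python
def update_belief(prior_coop, yielded, p_yield_coop=0.9, p_yield_aggr=0.2):
    """Return the posterior probability that a driver is cooperative.

    prior_coop   -- prior probability that the driver is cooperative
    yielded      -- True if the driver was observed to yield at this step
    p_yield_coop -- assumed chance that a cooperative driver yields
    p_yield_aggr -- assumed chance that an aggressive driver yields
    """
    like_coop = p_yield_coop if yielded else 1.0 - p_yield_coop
    like_aggr = p_yield_aggr if yielded else 1.0 - p_yield_aggr
    numerator = like_coop * prior_coop
    evidence = numerator + like_aggr * (1.0 - prior_coop)
    return numerator / evidence


belief = 0.5  # start undecided about the other driver
for observation in (True, True, False):
    belief = update_belief(belief, observation)
print(round(belief, 3))
```

A learned policy could take such a belief as part of its input, so that the merging vehicle commits to a gap when the other driver is likely to yield and waits when the driver is likely not to.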